This paper aims to improve the Warped Planar Object Detection Network (WPOD-Net) through feature engineering to increase accuracy. Specifically, we add edge information from the image to enrich the features that the original WPOD-Net model uses to determine the license plate contour. The Sobel filter, selected experimentally, acts as a convolutional layer, and its edge information is combined with the features of the original network to create the final embedding vector. The proposed model was compared with the original model on a dataset that we collected for evaluation. The results, measured by Quadrilateral Intersection over Union, show that the proposed model significantly improves performance.
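As a rough illustration of how a Sobel filter can behave like a fixed-weight convolutional layer, the sketch below applies the two standard 3x3 Sobel kernels to a grayscale image and returns the gradient magnitude. The function names and the pure-Python list representation are assumptions made for illustration; the abstract does not say how the authors wire the filter into WPOD-Net.

```python
import math

# Standard 3x3 Sobel kernels (horizontal and vertical gradients).
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1],
           [0, 0, 0],
           [1, 2, 1]]


def correlate2d_valid(image, kernel):
    """Slide the kernel over the image without padding (as a CNN layer does)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def sobel_magnitude(image):
    """Hypothetical helper: per-pixel edge strength from both Sobel responses."""
    gx = correlate2d_valid(image, SOBEL_X)
    gy = correlate2d_valid(image, SOBEL_Y)
    return [[math.hypot(x, y) for x, y in zip(row_x, row_y)]
            for row_x, row_y in zip(gx, gy)]


# A 5x6 image with a vertical step edge between columns 2 and 3.
image = [[0, 0, 0, 1, 1, 1] for _ in range(5)]
edges = sobel_magnitude(image)
```

In a network, an edge map of this kind could be concatenated with learned feature maps, which is one plausible reading of "combined with the features of the original network".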
Recent development in the field of explainable artificial intelligence (XAI) has helped improve trust in Machine-Learning-as-a-Service (MLaaS) systems, in which an explanation is provided together with the model prediction in response to each query. However, XAI also opens a door for adversaries to gain insights into the black-box models in MLaaS, thereby making the models more vulnerable to several attacks. For example, feature-based explanations (e.g., SHAP) could expose the top important features that a black-box model focuses on. Such disclosure has been exploited to craft effective backdoor triggers against malware classifiers. To address this trade-off, we introduce a new concept of achieving local differential privacy (LDP) in the explanations, and from that we establish a defense, called XRand, against such attacks. We show that our mechanism restricts the information that the adversary can learn about the top important features, while maintaining the faithfulness of the explanations.
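To make the idea of local differential privacy concrete, here is a minimal sketch of classic binary randomized response applied to a mask that marks which features are among the top important ones. This is not the XRand mechanism, whose details the abstract does not give; the function names and the per-bit privacy parameter `epsilon` are assumptions for illustration.

```python
import math
import random


def randomized_response(bit, epsilon, rng):
    """Keep the true bit with probability e^eps / (1 + e^eps), else flip it."""
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return bit if rng.random() < p_truth else 1 - bit


def privatize_top_features(top_mask, epsilon, seed=None):
    """Hypothetical helper: perturb a 0/1 mask of 'top feature' indicators.

    Each bit is perturbed independently, so each bit alone satisfies
    epsilon-LDP; the budget for the whole mask grows with its length.
    """
    rng = random.Random(seed)
    return [randomized_response(b, epsilon, rng) for b in top_mask]


mask = [1, 0, 0, 1, 0]
noisy_mask = privatize_top_features(mask, epsilon=1.0, seed=0)
```

A smaller `epsilon` flips bits more often, which hides the true top features better but makes the reported explanation less faithful; that tension is the trade-off the abstract describes.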
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
Researchers produce thousands of scholarly documents containing valuable technical knowledge. The community faces the laborious task of reading these documents to identify, extract, and synthesize information. To automate information gathering, document-level question answering (QA) offers a flexible framework where human-posed questions can be adapted to extract diverse knowledge. Finetuning QA systems requires access to labeled data (tuples of context, question and answer). However, data curation for document QA is uniquely challenging because the context (i.e. answer evidence passage) needs to be retrieved from potentially long, ill-formatted documents. Existing QA datasets sidestep this challenge by providing short, well-defined contexts that are unrealistic in real-world applications. We present a three-stage document QA approach: (1) text extraction from PDF; (2) evidence retrieval from extracted texts to form well-posed contexts; (3) QA to extract knowledge from contexts to return high-quality answers -- extractive, abstractive, or Boolean. Using QASPER for evaluation, our detect-retrieve-comprehend (DRC) system achieves a +7.19 improvement in Answer-F1 over existing baselines while delivering superior context selection. Our results demonstrate that DRC holds tremendous promise as a flexible framework for practical scientific document QA.
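For readers unfamiliar with answer-level F1, the sketch below computes a token-overlap F1 between a predicted and a reference answer, in the style commonly used for extractive QA. It assumes simple lowercasing and whitespace tokenization; the exact normalization used in the QASPER evaluation may differ, and `token_f1` is a hypothetical name.

```python
from collections import Counter


def token_f1(prediction, reference):
    """Token-level F1 between two answer strings (whitespace tokens)."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; exactly one empty is a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


score = token_f1("the cat sat", "the cat")  # precision 2/3, recall 1
```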
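An autonomous robot with a limited vision range finds a path to the goal in a 2D environment while avoiding polygonal obstacles. In the process of discovering the environment map, the robot must return to certain previously marked positions, and the regions it traverses to return are defined as a sequence of bundles of line segments. This paper presents a novel algorithm for finding approximately shortest paths along a sequence of bundles of line segments, based on the method of multiple shooting. Three factors of the method are presented: bundle partition, the collinear condition, and the update of shooting points. We then show that if the collinear condition holds, the shortest path of the problem is determined; otherwise, the sequence of paths obtained by the updates of the method converges to the shortest path. The algorithm is implemented in Python, and numerical examples show that path planning for autonomous robots runs faster with our method than with the rubber band technique of Li and Klette in Euclidean Shortest Paths, Springer, 53-89 (2011).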
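A wireless sensor network consists of randomly distributed sensor nodes for monitoring targets or areas of interest. Maintaining the network for continuous monitoring is a challenge because each sensor has limited battery capacity. Wireless power transfer technology is emerging as a reliable solution for charging the sensors by deploying a mobile charger (MC). However, designing an optimal charging path for the MC is challenging because of uncertainties arising in the network. The energy consumption rate of the sensors can fluctuate significantly due to unpredictable changes in the network topology, such as node failures. These changes also shift the importance of each sensor, which existing works often assume to be the same. In this paper, we propose a novel adaptive charging scheme using a deep reinforcement learning (DRL) approach to address these challenges. Specifically, we endow the MC with a charging policy that determines the next sensor to charge, conditioned on the current state of the network. We then use a deep neural network to parametrize this charging policy, which is trained with reinforcement learning techniques. Our model can adapt to spontaneous changes in the network topology. Empirical results show that the proposed algorithm outperforms existing on-demand algorithms by a significant margin.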
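Federated learning (FL) is an emerging technique for collaboratively training a global machine learning model while keeping the data localized on user devices. The main obstacle to practical FL implementations is the non-independent and identically distributed (Non-IID) data distribution among users, which slows convergence and degrades performance. To tackle this fundamental issue, we propose a method (ComFed) that enhances the whole training process on both the client and server sides. The key idea of ComFed is to simultaneously use a client-variance reduction technique to facilitate server aggregation and a global adaptive update technique to accelerate learning. Our experiments on the CIFAR-10 classification task show that ComFed can improve on state-of-the-art algorithms dedicated to Non-IID data.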
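As background for the server aggregation step mentioned above, the following sketch shows plain federated averaging (FedAvg), in which the server averages client parameters weighted by local dataset size. This is the common baseline and not ComFed itself: the variance reduction and adaptive update techniques from the abstract are not reproduced, and the function name and flat-list parameter format are assumptions.

```python
def fedavg(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local sample counts.

    client_weights: list of equal-length lists of floats (one per client).
    client_sizes: number of local training samples held by each client.
    """
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Two clients; the second holds three times as much data as the first.
global_weights = fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

With Non-IID data, client updates can point in quite different directions, so a simple weighted mean like this can converge slowly, which is the problem the abstract sets out to address.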
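The propagation of rumours on social media poses an important threat to societies, so various rumour detection techniques have been proposed recently. Yet existing work focuses on \emph{what} entities constitute a rumour and provides little support for understanding \emph{why} the entities have been classified as such. This prevents an effective evaluation of the detected rumours as well as the design of countermeasures. In this work, we argue that explanations for detected rumours may be given in terms of examples of related rumours detected in the past. A set of similar rumours helps users to generalize, i.e., to understand the properties that govern the detection of rumours. Since the spread of rumours on social media is commonly modelled using feature-annotated graphs, we propose a query-by-example approach that, given a rumour graph, extracts the $k$ most similar and diverse subgraphs from past rumours. The challenge is that all of the computations require fast assessment of similarities between graphs. To achieve an efficient and adaptive realization of the approach in a streaming setting, we present a novel graph representation learning technique and report on implementation considerations. Our evaluation experiments show that our approach outperforms baseline techniques in delivering meaningful explanations for various rumour propagation behaviours.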
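Designing an automatic checkout system for retail stores at human-level accuracy is challenging because of similar-looking products and their various poses. This paper addresses the problem by proposing a method with a two-stage pipeline. The first stage detects class-agnostic items, and the second is dedicated to classifying product categories. We also track objects across video frames to avoid duplicate counting. One major challenge is the domain gap, because the models are trained on synthetic data but tested on real images. To reduce this gap, we adopt domain generalization methods for the first-stage detector. In addition, a model ensemble is used to enhance the robustness of the second-stage classifier. The method is evaluated on the AI City Challenge 2022 - Track 4 and obtains an F1 score of $40\%$ on the test A set. Code is released at https://github.com/cybercore-co-ltd/aicity22-track4.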
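In light field compression, graph-based coding is powerful for exploiting signal redundancy along irregular shapes and obtains good energy compaction. However, apart from the high complexity of processing high-dimensional graphs, the graph construction method is highly sensitive to the accuracy of the disparity information between viewpoints. In real-world light fields or synthetic light fields generated by computer software, using disparity information for super-ray projection may suffer from inaccuracy because of the vignetting effect and the large disparity between views in the two types of light fields. This paper introduces two novel projection schemes that yield less error in disparity information, one of which can also significantly reduce the computation time for both encoder and decoder. Experimental results show that these proposals considerably enhance the projection quality of super-pixels across views, as well as rate-distortion performance, compared with the original projection scheme and with HEVC-based or JPEG Pleno-based coding approaches.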